Creating a Mini-Game in Python



What I Tried to Make



Since I knew that Python could be used to build things like artificial intelligence, I decided to try making something like the AI assistant that appears in a certain American comic about a man of iron.

First, I listed up what I would need for the build:
  • A speech recognition tool
  • A voice conversion tool
  • A speech synthesis (text-to-speech) tool
  • Xcode, etc.



    ...and so on. Xcode is on the list because I wondered whether I could get the finished program onto an iPhone through Xcode.
    When I looked into it, I found that it should be possible to get a Python program onto the iPhone itself via Xcode, so I adopted it.
    With that, I started building.


    First I thought about which situations the assistant should be useful in, and wrote the code for that part.
    What remained were the tools covering speech recognition through speech synthesis.
    I started by downloading a library called SpeechRecognition, and on top of that tried to download or use a speech recognition engine/API.
    I chose Google Speech Recognition, but here a problem arose.
    SpeechRecognition alone would not run, and Google Speech Recognition alone would not run either; the piece that connects the two is a package called pyaudio, and when I tried to download it, to my surprise the service appeared to have been discontinued.
    Afterward I searched for a replacement for pyaudio but could not find one, and the project I had set out to build ended up being shelved.
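In hindsight, one way to deal with a dependency like pyaudio that may fail to install is to probe for available backends up front and fail with a clear message, instead of crashing on import. A minimal sketch (the candidate backend names here are just examples, not a claim about what would have worked):

```python
import importlib.util


def find_audio_backend(candidates=("pyaudio", "sounddevice")):
    """Return the name of the first importable audio backend, or None."""
    for name in candidates:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


backend = find_audio_backend()
if backend is None:
    print("No audio backend found -- install one of: pyaudio, sounddevice")
else:
    print("Using audio backend:", backend)
```

Probing with `importlib.util.find_spec` checks whether the module is installed without actually importing it, so the check itself cannot crash the program.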

    Note: below is the code I was planning to run first, had pyaudio been available to download.

    import queue
    import re
    import sys

    from google.cloud import speech
    import pyaudio

    # Audio recording parameters
    RATE = 16000
    CHUNK = int(RATE / 10)  # 100 ms


    class MicrophoneStream:
        """Opens a recording stream as a generator yielding the audio chunks."""

        def __init__(self, rate, chunk):
            self._rate = rate
            self._chunk = chunk
            # Create a thread-safe buffer of audio data
            self._buff = queue.Queue()
            self.closed = True

        def __enter__(self):
            self._audio_interface = pyaudio.PyAudio()
            self._audio_stream = self._audio_interface.open(
                format=pyaudio.paInt16,
                # The API currently only supports 1-channel (mono) audio
                # https://goo.gl/z757pE
                channels=1,
                rate=self._rate,
                input=True,
                frames_per_buffer=self._chunk,
                # Run the audio stream asynchronously to fill the buffer object.
                # This is necessary so that the input device's buffer doesn't
                # overflow while the calling thread makes network requests, etc.
                stream_callback=self._fill_buffer,
            )
            self.closed = False
            return self

        def __exit__(self, type, value, traceback):
            self._audio_stream.stop_stream()
            self._audio_stream.close()
            self.closed = True
            # Signal the generator to terminate so that the client's
            # streaming_recognize method will not block the process termination.
            self._buff.put(None)
            self._audio_interface.terminate()

        def _fill_buffer(self, in_data, frame_count, time_info, status_flags):
            """Continuously collect data from the audio stream, into the buffer."""
            self._buff.put(in_data)
            return None, pyaudio.paContinue

        def generator(self):
            while not self.closed:
                # Use a blocking get() to ensure there's at least one chunk of
                # data, and stop iteration if the chunk is None, indicating the
                # end of the audio stream.
                chunk = self._buff.get()
                if chunk is None:
                    return
                data = [chunk]
                # Now consume whatever other data's still buffered.
                while True:
                    try:
                        chunk = self._buff.get(block=False)
                        if chunk is None:
                            return
                        data.append(chunk)
                    except queue.Empty:
                        break
                yield b"".join(data)


    def listen_print_loop(responses):
        """Iterates through server responses and prints them.

        The responses passed is a generator that will block until a response
        is provided by the server.

        Each response may contain multiple results, and each result may contain
        multiple alternatives; for details, see https://goo.gl/tjCPAU. Here we
        print only the transcription for the top alternative of the top result.

        In this case, responses are provided for interim results as well. If the
        response is an interim one, print a carriage return at the end of it, to
        allow the next result to overwrite it, until the response is a final one.
        For the final one, print a newline to preserve the finalized transcription.
        """
        num_chars_printed = 0
        for response in responses:
            if not response.results:
                continue
            # The `results` list is consecutive. For streaming, we only care about
            # the first result being considered, since once it's `is_final`, it
            # moves on to considering the next utterance.
            result = response.results[0]
            if not result.alternatives:
                continue
            # Display the transcription of the top alternative.
            transcript = result.alternatives[0].transcript
            # Display interim results, but with a carriage return at the end of
            # the line, so subsequent lines will overwrite them.
            #
            # If the previous result was longer than this one, we need to print
            # some extra spaces to overwrite the previous result.
            overwrite_chars = " " * (num_chars_printed - len(transcript))
            if not result.is_final:
                sys.stdout.write(transcript + overwrite_chars + "\r")
                sys.stdout.flush()
                num_chars_printed = len(transcript)
            else:
                print(transcript + overwrite_chars)
                # Exit recognition if any of the transcribed phrases could be
                # one of our keywords.
                if re.search(r"\b(exit|quit)\b", transcript, re.I):
                    print("Exiting..")
                    break
                num_chars_printed = 0


    def main():
        # See http://g.co/cloud/speech/docs/languages
        # for a list of supported languages.
        language_code = "en-US"  # a BCP-47 language tag
        client = speech.SpeechClient()
        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=RATE,
            language_code=language_code,
        )
        streaming_config = speech.StreamingRecognitionConfig(
            config=config, interim_results=True
        )
        with MicrophoneStream(RATE, CHUNK) as stream:
            audio_generator = stream.generator()
            requests = (
                speech.StreamingRecognizeRequest(audio_content=content)
                for content in audio_generator
            )
            responses = client.streaming_recognize(streaming_config, requests)
            # Now, put the transcription responses to use.
            listen_print_loop(responses)


    if __name__ == "__main__":
        main()
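The core trick in `MicrophoneStream.generator` is independent of any audio hardware: block until at least one chunk arrives, greedily drain whatever else is already buffered, and stop on a `None` sentinel. It can be illustrated with a plain `queue.Queue` and byte strings standing in for audio data:

```python
import queue


def drain_chunks(buff):
    """Yield joined chunks from a queue, mimicking MicrophoneStream.generator."""
    while True:
        chunk = buff.get()  # block until at least one chunk arrives
        if chunk is None:   # None is the end-of-stream sentinel
            return
        data = [chunk]
        while True:         # consume whatever else is already buffered
            try:
                chunk = buff.get(block=False)
                if chunk is None:
                    return
                data.append(chunk)
            except queue.Empty:
                break
        yield b"".join(data)


buff = queue.Queue()
gen = drain_chunks(buff)
buff.put(b"he")
buff.put(b"llo")
first = next(gen)   # both buffered chunks come out joined: b"hello"
buff.put(None)
rest = list(gen)    # the sentinel ends iteration: []
print(first, rest)
```

Batching everything already in the buffer into one `yield` is what keeps the network side from falling behind the microphone callback in the real code.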